Goto

Collaborating Authors

 completeness score




OpenCML: End-to-End Framework of Open-world Machine Learning to Learn Unknown Classes Incrementally

Parmar, Jitendra, Thakur, Praveen Singh

arXiv.org Artificial Intelligence

Open-world machine learning is an emerging technique in artificial intelligence, where conventional machine learning models often follow closed-world assumptions, which can hinder their ability to retain previously learned knowledge for future tasks. However, automated intelligence systems must learn about novel classes and previously known tasks. The proposed model offers novel learning classes in an open and continuous learning environment. It consists of two different but connected tasks. First, it discovers unknown classes in the data and creates novel classes; next, it learns how to perform class incrementally for each new class. Together, they enable continual learning, allowing the system to expand its understanding of the data and improve over time. The proposed model also outperformed existing approaches in open-world learning. Furthermore, it demonstrated strong performance in continuous learning, achieving a highest average accuracy of 82.54% over four iterations and a minimum accuracy of 65.87%.


Unsupervised Document and Template Clustering using Multimodal Embeddings

Sampaio, Phillipe R., Maxcici, Helene

arXiv.org Artificial Intelligence

We study unsupervised clustering of documents at both the category and template levels using frozen multimodal encoders and classical clustering algorithms. We systematize a model-agnostic pipeline that (i) projects heterogeneous last-layer states from text-layout-vision encoders into token-type-aware document vectors and (ii) performs clustering with centroid- or density-based methods, including an HDBSCAN + $k$-NN assignment to eliminate unlabeled points. We evaluate eight encoders (text-only, layout-aware, vision-only, and vision-language) with $k$-Means, DBSCAN, HDBSCAN + $k$-NN, and BIRCH on five corpora spanning clean synthetic invoices, their heavily degraded print-and-scan counterparts, scanned receipts, and real identity and certificate documents. The study reveals modality-specific failure modes and a robustness-accuracy trade-off, with vision features nearly solving template discovery on clean pages while text dominates under covariate shift, and fused encoders offering the best balance. We detail a reproducible, oracle-free tuning protocol and the curated evaluation settings to guide future work on unsupervised document organization.




Addressing Leakage in Concept Bottleneck Models

Neural Information Processing Systems

Concept bottleneck models (CBMs) (Chen et al., 2020; Koh et al., 2020; Y eh et al., 2020; Lage & Doshi-V elez, 2020; Wang et al., 2017) propose explicitly aligning the intermediate layers of a The concept bottleneck model (CBM) makes its prediction in two stages. While soft concepts may improve predictive performance, this improvement comes at a cost.


Investigating the Potential of Using Large Language Models for Scheduling

Jobson, Deddy, Li, Yilin

arXiv.org Artificial Intelligence

The inaugural ACM International Conference on AI-powered Software introduced the AIware Challenge, prompting researchers to explore AI-driven tools for optimizing conference programs through constrained optimization. We investigate the use of Large Language Models (LLMs) for program scheduling, focusing on zero-shot learning and integer programming to measure paper similarity. Our study reveals that LLMs, even under zero-shot settings, create reasonably good first drafts of conference schedules. When clustering papers, using only titles as LLM inputs produces results closer to human categorization than using titles and abstracts with TFIDF. The code has been made publicly available.


Multi-dimensional concept discovery (MCD): A unifying framework with completeness guarantees

Vielhaben, Johanna, Blücher, Stefan, Strodthoff, Nils

arXiv.org Artificial Intelligence

For the trustworthy application of XAI, in particular for high-stake decisions, a more global model understanding is required. To this end, concept-based methods have been proposed, which are however not guaranteed to be bound to the actual model reasoning. To circumvent this problem, we propose Multi-dimensional Concept Discovery (MCD) as an extension of previous approaches that fulfills a completeness relation on the level of concepts. Our method starts from general linear subspaces as concepts and does neither require reinforcing concept interpretability nor re-training of model parts. We propose sparse subspace clustering to discover improved concepts and fully leverage the potential of multi-dimensional subspaces. MCD offers two complementary analysis tools for concepts in input space: (1) concept activation maps, that show where a concept is expressed within a sample, allowing for concept characterization through prototypical samples, and (2) concept relevance heatmaps, that decompose the model decision into concept contributions. Both tools together enable a detailed global understanding of the model reasoning, which is guaranteed to relate to the model via a completeness relation. Thus, MCD paves the way towards more trustworthy concept-based XAI. We empirically demonstrate the superiority of MCD against more constrained concept definitions.


On Concept-Based Explanations in Deep Neural Networks

Yeh, Chih-Kuan, Kim, Been, Arik, Sercan O., Li, Chun-Liang, Ravikumar, Pradeep, Pfister, Tomas

arXiv.org Machine Learning

Deep neural networks (DNNs) build high-level intelligence on low-level raw features. Understanding of this high-level intelligence can be enabled by deciphering the concepts they base their decisions on, as human-level thinking. In this paper, we study concept-based explainability for DNNs in a systematic framework. First, we define the notion of completeness, which quantifies how sufficient a particular set of concepts is in explaining a model's prediction behavior. Based on performance and variability motivations, we propose two definitions to quantify completeness. We show that under degenerate conditions, our method is equivalent to Principal Component Analysis. Next, we propose a concept discovery method that considers two additional constraints to encourage the interpretability of the discovered concepts. We use game-theoretic notions to aggregate over sets to define an importance score for each discovered concept, which we call ConceptSHAP. On specifically-designed synthetic datasets and real-world text and image datasets, we validate the effectiveness of our framework in finding concepts that are complete in explaining the decision, and interpretable.